The Winningest Methods in Competition (M4)¶

Reported by: Charls Chua
November 08, 2022

Flow of Discussion:¶

I.) Background of M4¶

II.) Benchmarks¶

III.) Accuracy Measures¶

IV.) Performance of Submissions¶

V.) Top Performers¶

VI.) 7 Major Findings¶

¶

BACKGROUND OF M4 FORECASTING COMPETITION¶

M4 FORECASTING COMPETITION¶

Started: November 2017¶

Ended: May 31, 2018¶

Aim:¶

Replicate and extend the three previous competitions by:
a.) significantly increasing the number of series,
b.) expanding the number of forecasting methods, and
c.) including prediction intervals in the evaluation process as well as point forecasts.

Also introduced in M4:¶

i.) More data 
ii.) Emphasis on the reproducibility of the results
iii.) Incorporation of a vast number of diverse series and benchmarks

¶

DATASET¶

In [74]:
from IPython.display import Image
Image(filename='table1.png', width="1500")
Out[74]:
In [51]:
Image(filename='2testing.png', width="1200")
Out[51]:
In [52]:
Image(filename='2serieslengthstat.png', width="1200")
Out[52]:

¶

In [76]:
## Load the libraries used for data handling, visualization, and modeling
import os
import numpy as np
from matplotlib import pyplot
import pandas as pd
from pandas.plotting import autocorrelation_plot
from sklearn.preprocessing import Normalizer, MinMaxScaler, StandardScaler
from sklearn.linear_model import LinearRegression, BayesianRidge, Ridge, RidgeCV, Lasso, LassoCV
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
from sklearn.kernel_ridge import KernelRidge
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR, LinearSVR
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPRegressor
from math import sqrt
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN, LSTM, Embedding
from tensorflow.keras.optimizers import RMSprop
from keras import backend as ker
import tensorflow as tf
os.chdir('/home/phd/cachua/ATSA')
os.getcwd()
Out[76]:
'/home/phd/cachua/ATSA'

¶

BENCHMARKS¶

In [15]:
Image(filename='table2.png', width = "1700") 
Out[15]:
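Table 2 above lists the statistical benchmarks. As a rough illustration of the simplest of them, here is a minimal sketch of the Naive and Seasonal Naive forecasts (Naïve 2 additionally applies the naive forecast to a seasonally adjusted series; that adjustment step is omitted here):

```python
import numpy as np

def naive(train, h):
    """Naive: repeat the last observed value h steps ahead."""
    return np.repeat(train[-1], h)

def seasonal_naive(train, h, m):
    """Seasonal Naive: repeat the last observed seasonal cycle of length m."""
    last_cycle = np.asarray(train[-m:], dtype=float)
    return np.resize(last_cycle, h)  # np.resize tiles the cycle up to length h

print(naive([1.0, 2.0, 3.0], 2))           # [3. 3.]
print(seasonal_naive([1, 2, 3, 4], 6, 2))  # [3. 4. 3. 4. 3. 4.]
```
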

¶

ACCURACY MEASURES:¶

Performance Measures for Point Forecasts (PFs)¶

  • measure the accuracy of single-point estimates of the series' expected value

symmetric Mean Absolute Percentage Error (sMAPE; Makridakis, 1993)¶


- is an accuracy measure based on percentage (or relative) errors.
- The relative error is the absolute error divided by the magnitude of the exact value.
- In contrast to the mean absolute percentage error, sMAPE has both a lower bound and an upper bound.
- Because it is percentage-based, sMAPE is scale-independent, so it can be used to compare forecast performance between datasets.
- A limitation of sMAPE is that when the actual or the forecast value is 0, the error approaches 100%.
- The lower the sMAPE value of a forecast, the higher its accuracy.

In [13]:
Image(filename='sMAPE.png') 
Out[13]:
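The sMAPE formula shown above can be sketched directly in Python (a minimal version of the measure as defined in the figure; the organizers' official implementation lives in the M4 GitHub repository):

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric MAPE: mean of 2|Y - Yhat| / (|Y| + |Yhat|), in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(2.0 * np.abs(actual - forecast)
                           / (np.abs(actual) + np.abs(forecast)))

print(smape([100, 110, 120], [100, 110, 120]))  # a perfect forecast scores 0.0
```
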

Mean Absolute Scaled Error (MASE; Hyndman & Koehler, 2006)¶


- is the mean absolute error of the forecast values, divided by the mean absolute error of the naive forecast, where the naive forecast uses the previous value or an average of several previous points.
- is a scale-free error metric.

- MASE can be used to compare forecast methods on a single series and also to compare forecast accuracy between series.
- is a very good metric to use unless all of the historical observations are equal.
- If they are equal, the observations form a straight line and the MASE is infinite or undefined (the scaling denominator becomes zero).

In [77]:
Image(filename='MASE.png', width="600") 
Out[77]:

Where:
Yt is the value of the time series at point t
Ŷt is the estimated forecast
h is the forecast horizon
n is the number of data points available in the sample
m is the time interval between successive observations considered by the organizers for each data frequency
i.e. (12 for monthly, 4 for quarterly, 24 for hourly, 1 for yearly, 1 for daily data)
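A minimal Python sketch of MASE, using the common variant that scales by the in-sample seasonal-naive MAE with period m (the variable names follow the definitions above; the organizers' exact implementation may differ in how it forms the scaling term):

```python
import numpy as np

def mase(train, actual, forecast, m=1):
    """MASE: mean absolute forecast error over the horizon, divided by the
    in-sample mean absolute error of the seasonal naive forecast (period m)."""
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    scale = np.mean(np.abs(train[m:] - train[:-m]))  # naive in-sample MAE
    return np.mean(np.abs(actual - forecast)) / scale

# With a unit naive error, MASE equals the out-of-sample MAE:
print(mase([1, 2, 3, 4], actual=[5, 6], forecast=[5, 5], m=1))  # 0.5
```
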

OWA (Overall weighted average)¶

OWA of sMAPE and MASE is calculated by first dividing their total value by the corresponding value of Naïve 2 to obtain the relative sMAPE and the relative MASE, respectively, and then computing their simple arithmetic mean.

Example: if Method X displays a MASE of 1.6 and an sMAPE of 12.5% across the 100,000 series of M4,

while Naïve 2 displays a MASE of 1.9 and an sMAPE of 13.7%, 

the relative MASE of Method X would be equal to 1.6/1.9 = 0.84 
and the relative and sMAPE for method X would be equal to 12.5/13.7 = 0.91,

resulting in an OWA of (0.84 + 0.91)/2 = 0.88, 

which indicates that, on average, the method examined is about 12% more accurate than Naïve 2.
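The worked example can be checked in a few lines (numbers taken directly from the text):

```python
# Method X vs the Naive 2 benchmark, across the 100,000 series of M4.
mase_x, smape_x = 1.6, 12.5
mase_naive2, smape_naive2 = 1.9, 13.7

rel_mase = mase_x / mase_naive2      # ~0.84
rel_smape = smape_x / smape_naive2   # ~0.91
owa = (rel_mase + rel_smape) / 2
print(round(owa, 2))  # 0.88 -> about 12% more accurate than Naive 2
```
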

Performance Measures for Prediction Intervals (PIs)¶

  • used to provide a range within which the forecast is likely to fall with a specified degree of confidence.


For example, if you made 100 forecasts with 95% confidence, you would expect about 95 of them to fall within their prediction intervals.

Mean Scaled Interval Score (MSIS) of Gneiting and Raftery (2007)¶

In [9]:
Image(filename='MSIS.png', width="1000") 
Out[9]:

where:
Lt is the lower bound of the prediction interval
Ut is the upper bound of the prediction interval
Yt are the future observations of the series
a is the significance level
1 is the indicator function
(being 1 if Yt is within the postulated interval and 0 otherwise).
Since the forecasters were asked to generate 95% prediction intervals, a is set to 0.05
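The MSIS formula above can be sketched as follows (a minimal version: the interval width plus a 2/a penalty whenever the actual falls outside the interval, scaled by the same seasonal-naive term as MASE):

```python
import numpy as np

def msis(train, actual, lower, upper, alpha=0.05, m=1):
    """Mean Scaled Interval Score for a (1 - alpha) prediction interval."""
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    width = upper - lower
    below = (2.0 / alpha) * (lower - actual) * (actual < lower)  # penalty if Yt < Lt
    above = (2.0 / alpha) * (actual - upper) * (actual > upper)  # penalty if Yt > Ut
    scale = np.mean(np.abs(train[m:] - train[:-m]))              # naive in-sample MAE
    return np.mean(width + below + above) / scale

# Actual inside the interval: only the width contributes.
print(msis([0, 1, 2, 3], actual=[2], lower=[1], upper=[3]))  # 2.0
```
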

¶

BENCHMARK ACCURACIES¶

In [16]:
Image(filename='table6.png', width="2000") 
Out[16]:

PERFORMANCE OF SUBMISSIONS¶

There were 248 participants registered for the M4 Competition¶

Only 49 individuals/teams submitted valid PFs for all 100,000 time series.¶


10 benchmarks
2 standards for comparison

Only 20 forecasters provided valid PIs for all 100,000 series alongside their PFs.¶


1 benchmark
2 standards for comparison

In [19]:
Image(filename='table4.png', width="1800") 
Out[19]:
In [20]:
Image(filename='table4cont.png', width="1800") 
Out[20]:
In [21]:
Image(filename='table5.png', width="1800") 
Out[21]:

Winner: Slawek Smyl¶

Paper: A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting¶

Method used: ESRNN (Exponential Smoothing - Recurrent Neural Network), built on an advanced LSTM (Long Short-Term Memory) neural network

The ES equations enable the method to effectively capture the main components of the individual series, such as seasonality and level,

while the LSTM networks allow for non-linear trends and cross-learning.

The hybrid forecasting approach has the following three main elements:
i. Deseasonalization and adaptive normalization (ES)
a. the absence of time stamps made it difficult to detect seasonality patterns
ii. Generation of forecasts
iii. Ensembling
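Element (i) can be illustrated with a toy sketch of multiplicative exponential-smoothing updates used for deseasonalization and adaptive normalization. This is a simplified illustration, not Smyl's actual implementation: the smoothing parameters and the flat initial seasonal indices are assumptions.

```python
import numpy as np

def es_preprocess(y, m, alpha=0.5, beta=0.5):
    """Toy ES preprocessing: update level and seasonal indices multiplicatively,
    then divide each observation by level * season so the downstream network
    sees a deseasonalized, normalized series."""
    y = np.asarray(y, dtype=float)
    level = y[0]
    season = np.ones(m)  # flat initial seasonal indices (assumption)
    out = []
    for t, value in enumerate(y):
        s = season[t % m]
        level_new = alpha * value / s + (1 - alpha) * level        # level update
        season[t % m] = beta * value / level_new + (1 - beta) * s  # seasonal update
        level = level_new
        out.append(value / (level * season[t % m]))
    return np.array(out)

# A constant series is pure level with no seasonality, so it normalizes to 1:
print(es_preprocess([10.0] * 8, m=4))  # [1. 1. 1. 1. 1. 1. 1. 1.]
```
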

In [42]:
## Loading Dataset Hourly-train
df_hourly = pd.read_csv("Hourly-train.csv", skiprows=0, index_col =0)

hourly_train_Other = df_hourly

Dataset_hourly = hourly_train_Other.T.iloc[:,:]
print ("Dimension:  ",Dataset_hourly.shape,"\n")

## The following are visualized hourly data
print("\n","Visualisation of 50 first Series:" )
Sub_Dataset = Dataset_hourly.iloc[:,0:50]
Sub_Dataset.plot(figsize=(20, 8), legend = None)
#Sub_Dataset.plot.density()
pyplot.show()
Dimension:   (960, 414) 


 Visualisation of 50 first Series:
In [43]:
## Plotting smyl Hourly-forecast
df_smyl = pd.read_csv("Hourly-train_smyl.csv", skiprows=0, index_col =0)

hourly_train_smyl = df_smyl

Dataset_hourly_smyl = hourly_train_smyl.T.iloc[:,:]

## The following are visualized hourly data
print("\n","Visualisation of 50 first Series:" )
Sub_hourly_smyl = Dataset_hourly_smyl.iloc[:,0:50]

Sub_hourly_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

48 forecast points were plotted to continue the training data set.

In [45]:
## Loading Dataset Daily-train
df_daily = pd.read_csv("Daily-train.csv", skiprows=0, index_col =0)

daily_train_Finance = df_daily[2035:3594].T
daily_train_Other = df_daily[3594:4227].T
daily_train_Industry = df_daily[1613:2035].T
daily_train_Demographic = df_daily[1603:1613].T
daily_train_Micro = df_daily[127:1603].T
daily_train_Macro = df_daily[:127].T

Dataset_daily = df_daily.T.iloc[:,:]
print ("Dimension:  ",Dataset_daily.shape,"\n")

## The following are visualized daily data
print("\n","Visualisation of 50 first Series:" )
Sub_Dataset = Dataset_daily.iloc[:,0:50]
Sub_Dataset.plot(figsize=(20, 8), legend = None)
#Sub_Dataset.plot.density()
pyplot.show()
Dimension:   (9919, 4227) 


 Visualisation of 50 first Series:
In [46]:
## Plotting smyl daily-forecast
df_smyl = pd.read_csv("Daily-train_smyl.csv", skiprows=0, index_col =0)

daily_train_smyl = df_smyl

Dataset_daily_smyl = daily_train_smyl.T.iloc[:,:]

## The following are visualized daily data
print("\n","Visualisation of 50 first Series:" )
Sub_daily_smyl = Dataset_daily_smyl.iloc[:,0:50]

Sub_daily_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

14 forecast points were plotted to continue the training data set.

In [47]:
## Loading Dataset Weekly-train
df_weekly = pd.read_csv("Weekly-train.csv", skiprows=0, index_col =0)

Weekly_train = df_weekly
Weekly_train_Finance = df_weekly[59:223].T.iloc[:,:]
Weekly_train_Other = df_weekly[:12].T.iloc[:,:]
Weekly_train_Industry = df_weekly[53:59].T.iloc[:,:]
Weekly_train_Demographic = df_weekly[223:247].T.iloc[:,:]
Weekly_train_Micro = df_weekly[247:359].T.iloc[:,:]
Weekly_train_Macro = df_weekly[12:53].T.iloc[:,:]

Dataset_weekly = Weekly_train.T.iloc[:,:]
print ("Dimension:  ",Dataset_weekly.shape,"\n")

## The following are visualized weekly data
print("\n","Visualisation of 50 first Series:" )
Sub_Dataset = Dataset_weekly.iloc[:,0:50]
Sub_Dataset.plot(figsize=(20, 8), legend = None)
#Sub_Dataset.plot.density()
pyplot.show()
Dimension:   (2597, 359) 


 Visualisation of 50 first Series:
In [48]:
## Plotting smyl Weekly-forecast
df_smyl = pd.read_csv("Weekly-train_smyl.csv", skiprows=0, index_col =0)

weekly_train_smyl = df_smyl

Dataset_weekly_smyl = weekly_train_smyl.T.iloc[:,:]

## The following are visualized weekly data
print("\n","Visualisation of 50 first Series:" )
Sub_weekly_smyl = Dataset_weekly_smyl.iloc[:,0:50]

Sub_weekly_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

Some specific time series were removed and plotted separately; 13 forecast points were plotted to continue the training data set.

In [49]:
## Plotting smyl Weekly-forecast
df_smyl = pd.read_csv("Weekly-train_smyl2.csv", skiprows=0, index_col =0)

weekly_train_smyl = df_smyl

Dataset_weekly_smyl = weekly_train_smyl.T.iloc[:,:]

## The following are visualized weekly data
print("\n","Visualisation of 50 first Series:" )
Sub_weekly2_smyl = Dataset_weekly_smyl.iloc[:,0:50]

Sub_weekly2_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

13 forecast points were plotted to continue the training data set.

In [50]:
## Plotting smyl Weekly-forecast
df_smyl = pd.read_csv("Weekly-train_smyl3.csv", skiprows=0, index_col =0)

weekly_train_smyl = df_smyl

Dataset_weekly_smyl = weekly_train_smyl.T.iloc[:,:]

## The following are visualized weekly data
print("\n","Visualisation of 50 first Series:" )
Sub_weekly3_smyl = Dataset_weekly_smyl.iloc[:,0:50]

Sub_weekly3_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

13 forecast points were plotted to continue the training data set.

In [51]:
## Plotting smyl Weekly-forecast
df_smyl = pd.read_csv("Weekly-train_smyl4.csv", skiprows=0, index_col =0)

weekly_train_smyl = df_smyl

Dataset_weekly_smyl = weekly_train_smyl.T.iloc[:,:]

## The following are visualized weekly data
print("\n","Visualisation of 50 first Series:" )
Sub_weekly4_smyl = Dataset_weekly_smyl.iloc[:,0:50]

Sub_weekly4_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

13 forecast points were plotted to continue the training data set.

In [52]:
## Plotting smyl Weekly-forecast
df_smyl = pd.read_csv("Weekly-train_smyl5.csv", skiprows=0, index_col =0)

weekly_train_smyl = df_smyl

Dataset_weekly_smyl = weekly_train_smyl.T.iloc[:,:]

## The following are visualized weekly data
print("\n","Visualisation of 50 first Series:" )
Sub_weekly5_smyl = Dataset_weekly_smyl.iloc[:,0:50]

Sub_weekly5_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

13 forecast points were plotted to continue the training data set.

In [53]:
## Loading Dataset monthly-train
df_monthly = pd.read_csv("Monthly-train.csv", skiprows=0, index_col =0)

monthly_train_Finance = df_monthly[36736:47723].T
monthly_train_Other = df_monthly[47723:48000].T
monthly_train_Industry = df_monthly[26719:36736].T
monthly_train_Demographic = df_monthly[20991:26719].T
monthly_train_Micro = df_monthly[10016:20991].T
monthly_train_Macro = df_monthly[:10016].T

Dataset_monthly = df_monthly.T.iloc[:,:]
print ("Dimension:  ",Dataset_monthly.shape,"\n")

## The following are visualized monthly data
print("\n","Visualisation of 50 first Series:" )
Sub_Dataset = Dataset_monthly.iloc[:,0:50]
Sub_Dataset.plot(figsize=(20, 8), legend = None)
#Sub_Dataset.plot.density()
pyplot.show()
Dimension:   (2794, 48000) 


 Visualisation of 50 first Series:
In [54]:
## Plotting smyl Monthly-forecast
df_smyl = pd.read_csv("Monthly-train_smyl.csv", skiprows=0, index_col =0)

monthly_train_smyl = df_smyl

Dataset_monthly_smyl = monthly_train_smyl.T.iloc[:,:]

## The following are visualized monthly data
print("\n","Visualisation of 50 first Series:" )
Sub_monthly_smyl = Dataset_monthly_smyl.iloc[:,0:50]

Sub_monthly_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

18 forecast points were plotted to continue the training data set.

In [55]:
## Loading Dataset Quarterly-train
df_quaterly = pd.read_csv("Quarterly-train.csv", skiprows=0, index_col =0)

quaterly_train_Finance = df_quaterly[17830:23135].T
quaterly_train_Other = df_quaterly[23135:24000].T
quaterly_train_Industry = df_quaterly[13193:17830].T
quaterly_train_Demographic = df_quaterly[11335:13193].T
quaterly_train_Micro = df_quaterly[5315:11335].T
quaterly_train_Macro = df_quaterly[:5315].T

Dataset_quaterly = df_quaterly.T.iloc[:,:]
print ("Dimension:  ",Dataset_quaterly.shape,"\n")

## The following are visualized quarterly data
print("\n","Visualisation of 50 first Series:" )
Sub_Dataset = Dataset_quaterly.iloc[:,0:50]
Sub_Dataset.plot(figsize=(20, 8), legend = None)
#Sub_Dataset.plot.density()
pyplot.show()
Dimension:   (866, 24000) 


 Visualisation of 50 first Series:
In [56]:
## Plotting smyl Quarterly-forecast
df_smyl = pd.read_csv("Quarterly-train_smyl.csv", skiprows=0, index_col =0)

quarterly_train_smyl = df_smyl

Dataset_quarterly_smyl = quarterly_train_smyl.T.iloc[:,:]

## The following are visualized quarterly data
print("\n","Visualisation of 50 first Series:" )
Sub_quarterly_smyl = Dataset_quarterly_smyl.iloc[:,0:50]

Sub_quarterly_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

8 forecast points were plotted to continue the training data set.

In [57]:
## Loading Dataset Yearly-train
df_yearly = pd.read_csv("Yearly-train.csv", skiprows=0, index_col =0)

yearly_train_Finance = df_yearly[15245:21764].T
yearly_train_Other = df_yearly[21764:23000].T
yearly_train_Industry = df_yearly[11529:15245].T
yearly_train_Demographic = df_yearly[10441:11529].T
yearly_train_Micro = df_yearly[3903:10441].T
yearly_train_Macro = df_yearly[:3903].T

Dataset_yearly = df_yearly.T
print ("Dimension:  ",Dataset_yearly.shape,"\n")

## The following are visualized yearly data
print("\n","Visualisation of 50 first Series:" )
Sub_Dataset = Dataset_yearly.iloc[:,0:50]
Sub_Dataset.plot(figsize=(20, 8), legend = None)
#Sub_Dataset.plot.density()
pyplot.show()
Dimension:   (835, 23000) 


 Visualisation of 50 first Series:
In [59]:
## Plotting smyl Yearly-forecast
df_smyl = pd.read_csv("Yearly-train_smyl.csv", skiprows=0, index_col =0)

yearly_train_smyl = df_smyl

Dataset_yearly_smyl = yearly_train_smyl.T.iloc[:,:]

## The following are visualized yearly data
print("\n","Visualisation of 50 first Series:" )
Sub_yearly_smyl = Dataset_yearly_smyl.iloc[:,0:50]

Sub_yearly_smyl.plot(figsize=(20, 8), legend = None)
#df_smyl.plot.density()
pyplot.show()
 Visualisation of 50 first Series:

6 forecast points were plotted to continue the training data set.

What did not work well?¶

The method generated accurate forecasts for most of the frequencies, especially the monthly, yearly and quarterly ones.

However, the accuracy was sub-optimal for the hourly, daily and weekly data.

This is partly explained by the author's concentration on the "big three" subsets: monthly, yearly and quarterly. They covered 95% of the data, so performing well on them was key to success in the Competition.

However, subsequent work on hourly, daily and weekly data confirmed that the under-performance is a real problem.

¶

1st Runner-up: Montero-Manso, et al.¶

Paper: FFORMA: Feature-based Forecast Model Averaging (M4metalearning)¶

The model combines nine different well-known forecasting algorithms:

  1. ARIMA The ARIMA model. Parameters such as the order of differencing and the AR/MA orders p and q are obtained through an exhaustive search. The code for fitting the model is forecast::auto.arima(x, stepwise=FALSE, approximation=FALSE)

  2. ETS Exponential smoothing state space model with parameters fitted automatically. The code for fitting the model is forecast::ets(x)

  3. NNETAR A feed-forward neural network with a single hidden layer is fitted to the lags. The number of lags is automatically selected. The code for fitting the model is forecast::nnetar(x)

  4. TBATS The Exponential smoothing state space Trigonometric, Box-Cox transformation, ARMA errors, Trend and Seasonal components model. Parameters like the application of a Box-Cox transformation, to include trend, etc. are automatically fitted. The code for fitting the model is forecast::tbats(x)

  5. STLM-AR Seasonal and Trend decomposition using Loess with AR modeling of the seasonally adjusted series. The code for fitting the model is forecast::stlm(x, modelfunction = stats::ar)

  6. RW-DRIFT Random Walk with Drift. The code for fitting the model is forecast::rwf(x, drift=TRUE)

  7. THETAF The theta method of Assimakopoulos and Nikolopoulos (2000). The code for fitting the model is forecast::thetaf(x)

  8. NAIVE The naive method, using the last observation of the series as the forecasts. The code for fitting the model is forecast::naive(x)

  9. SEASONAL NAIVE The forecasts are the last observed values of the same season. The code for fitting the model is forecast::snaive(x)
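FFORMA's combination step can be sketched as a weighted average of the base-method forecasts, where the weights come from a gradient-boosted meta-learner trained on series features (the meta-learner itself is not shown, and the numbers below are purely illustrative, not from the paper):

```python
import numpy as np

# Hypothetical 3-step-ahead forecasts from three of the nine base methods.
base_forecasts = np.array([
    [102.0, 104.0, 106.0],   # e.g. ARIMA
    [100.0, 101.0, 102.0],   # e.g. ETS
    [ 98.0,  98.0,  98.0],   # e.g. NAIVE
])
weights = np.array([0.5, 0.3, 0.2])  # meta-learner output; sums to 1

combined = weights @ base_forecasts  # weighted average, one value per horizon step
print(combined)  # [100.6 101.9 103.2]
```
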

TOP PERFORMING SUBMISSIONS¶

In [53]:
Image(filename='2top10perf.png', width="1200") 
Out[53]:
In [54]:
Image(filename='2forecastingattributes.png', width="1200") 
Out[54]:

EASIEST FOUR¶

The easiest four are the four time series (by series number) for which the participants' submitted forecasts had the smallest errors.

The top 10 performing methods were plotted against each other.

In [78]:
Image(filename='2top10.png') 
Out[78]:
In [82]:
os.chdir('/home/phd/cachua/ATSA/photos')
os.getcwd()
print("\n","Hourly % Error - Easiest Four" )
Image(filename='error-hourly-easy.png', width="1000") 
 Hourly % Error - Easiest Four
Out[82]:
In [59]:
Image(filename='2top10.png') 
Out[59]:
In [81]:
print("\n","Daily % Error - Easiest Four" )
Image(filename='error-daily-easy.png', width="1000") 
Out[81]:
In [61]:
Image(filename='2top10.png') 
Out[61]:
In [83]:
print("\n","Weekly % Error - Easiest Four" )
Image(filename='error-weekly-easy.png', width="1000") 
 Weekly % Error - Easiest Four
Out[83]:
In [63]:
Image(filename='2top10.png') 
Out[63]:
In [84]:
print("\n","Monthly % Error - Easiest Four" )
Image(filename='error-monthly-easy.png', width="1000") 
 Monthly % Error - Easiest Four
Out[84]:
In [65]:
Image(filename='2top10.png') 
Out[65]:
In [85]:
print("\n","Quarterly % Error - Easiest Four" )
Image(filename='error-quarterly-easy.png', width="1000") 
 Quarterly % Error - Easiest Four
Out[85]:
In [67]:
Image(filename='2top10.png') 
Out[67]:
In [86]:
print("\n","Yearly % Error - Easiest Four" )
Image(filename='error-yearly-easy.png', width="1000") 
 Yearly % Error - Easiest Four
Out[86]:

¶

MOST DIFFICULT FOUR¶

The most difficult four are the four time series (by series number) for which the participants' submitted forecasts had the highest errors.

The top 10 performing methods were plotted against each other.

In [70]:
Image(filename='2top10.png') 
Out[70]:
In [87]:
print("\n","Hourly % Error - Difficult Four" )
Image(filename='error-hourly-hard.png', width="1000") 
 Hourly % Error - Difficult Four
Out[87]:
In [72]:
Image(filename='2top10.png') 
Out[72]:
In [88]:
print("\n","Daily % Error - Difficult Four" )
Image(filename='error-daily-hard.png', width="1000")
 Daily % Error - Difficult Four
Out[88]:
In [74]:
Image(filename='2top10.png') 
Out[74]:
In [89]:
print("\n","Weekly % Error - Difficult Four" )
Image(filename='error-weekly-hard.png', width="1000")
 Weekly % Error - Difficult Four
Out[89]:
In [76]:
Image(filename='2top10.png') 
Out[76]:
In [90]:
print("\n","Monthly % Error - Difficult Four" )
Image(filename='error-monthly-hard.png', width="1000")
 Monthly % Error - Difficult Four
Out[90]:
In [78]:
Image(filename='2top10.png') 
Out[78]:
In [91]:
print("\n","Quarterly % Error - Difficult Four" )
Image(filename='error-quarterly-hard.png', width="1000")
 Quarterly % Error - Difficult Four
Out[91]:
In [80]:
Image(filename='2top10.png') 
Out[80]:
In [92]:
print("\n","Yearly % Error - Difficult Four" )
Image(filename='error-yearly-hard.png', width="1000")
 Yearly % Error - Difficult Four
Out[92]:

¶

STEP ERROR PERCENTAGE¶

In [94]:
Image(filename='2top10.png', width="800") 
Out[94]:
In [96]:
Image(filename='2steperro1.png', width ='1000')
Out[96]:
In [97]:
Image(filename='2steperro2.png', width ='1000')
Out[97]:

Average step-error percentage for the top 10 methods and all categories. The x-axis represents the forecasting horizon; a negative error indicates that the method forecast a value lower than the actual.

PERFORMANCE EVALUATION¶

In [86]:
Image(filename='2performanceevaluation.png', width="2000")
Out[86]:

M COMPETITION OUTPUT¶

In [102]:
Image(filename='table8updated.png', width="2000") 
Out[102]:

CONCLUSION¶

7 MAJOR FINDINGS OF THE M4 COMPETITION¶

1. The improved numerical accuracy of combining
2. The superiority of a hybrid approach that utilizes both statistical and ML features
3. The significant differences between the six top-performing methods and the rest in terms of PFs
4. The improved precision of the PIs
5. More complex methods can possibly lead to greater forecasting accuracy
6. Using information from multiple series to predict individual ones
7. The poor performance of the submitted pure ML methods

1. The improved numerical accuracy of combining¶

2. The superiority of a hybrid approach that utilizes both statistical and ML features¶

3. The significant differences between the six top-performing methods and the rest in terms of PFs¶

In [14]:
Image(filename='figure2.png') 
Out[14]:

4. The Improved precision of PIs¶

5. More complex methods can possibly lead to greater forecasting accuracy¶

In [27]:
Image(filename='table7.png', width="1200") 
Out[27]:

6. Using information from multiple series to predict individual ones.¶

For the first time, the top three performing methods of the M4, as measured by PFs, introduced information from multiple series (aggregated by data frequency) in order to decide on the most effective way of forecasting and/or selecting the weights for combining the various statistical/ML methods considered.

It must be investigated whether such information was part of the reason why these three methods managed to perform better than the remaining ones, and if so, to what extent.

7. The poor performance of the submitted pure ML methods¶

A total of five ML methods (four pure and one combination) were submitted in M4. All of them were less accurate than the Comb benchmark in terms of PFs, and only one was more accurate than the Naïve 2 method.

-END-¶

References:¶

The M4 Competition: 100,000 time series and 61 forecasting methods¶

  • by Spyros Makridakis, Evangelos Spiliotis and Vassilios Assimakopoulos
  • Code Source: https://github.com/Mcompetitions/M4-methods

Reproducibility of the Top-Performing Methods in the M4 Competition¶

  • by Maria Soleim, Odd Erik Gundersen

A Multi-Factor Analysis of Forecasting Methods: A Study on the M4 Competition¶

  • by Pantelis Agathangelou, Demetris Trihinas, Ioannis Katakis
  • Code Source: https://github.com/unic-ailab/m4-multi-factor-analysis